Abstract
This article deals with the right-tail behavior of a response distribution \(F_Y\) conditional on a regressor vector \({\mathbf {X}}={\mathbf {x}}\) restricted to the heavy-tailed case of Pareto-type conditional distributions \(F_Y(y|\ {\mathbf {x}})=P(Y\le y|\ {\mathbf {X}}={\mathbf {x}})\), with heaviness of the right tail characterized by the conditional extreme value index \(\gamma ({\mathbf {x}})>0\). We particularly focus on testing the hypothesis \({\mathscr {H}}_{0,tail}:\ \gamma ({\mathbf {x}})=\gamma _0\) of constant tail behavior for some \(\gamma _0>0\) and all possible \({\mathbf {x}}\). When considering \({\mathbf {x}}\) as a time index, the term trend analysis is commonly used. In the recent past several such trend analyses in extreme value data have been published, mostly focusing on time-varying modeling of location or scale parameters of the response distribution. In many such environmental studies a simple test against trend based on Kendall’s tau statistic is applied. This test is powerful when the center of the conditional distribution \(F_Y(y|{\mathbf {x}})\) changes monotonically in \({\mathbf {x}}\), for instance, in a simple location model \(\mu ({\mathbf {x}})=\mu _0+x\cdot \mu _1\), \({\mathbf {x}}=(1,x)'\), but the test is rather insensitive against monotonic tail behavior, say, \(\gamma ({\mathbf {x}})=\eta _0+x\cdot \eta _1\). This has to be considered, since for many environmental applications the main interest is on the tail rather than the center of a distribution. Our work is motivated by this problem and it is our goal to demonstrate the opportunities and the limits of detecting and estimating non-constant conditional heavy-tail behavior with regard to applications from hydrology. We present and compare four different procedures by simulations and illustrate our findings on real data from hydrology: weekly maxima of hourly precipitation from France and monthly maximal river flows from Germany.
References
Angrist J, Chernozhukov V, Fernández-Val I (2006) Quantile regression under misspecification, with an application to the US wage structure. Econometrica 74(2):539–563. http://www.jstor.org/stable/3598810
Beirlant J, Goegebeur Y, Segers J, Teugels J (2006) Statistics of extremes: theory and applications. Wiley, Hoboken
Bernard E, Naveau P, Vrac M, Mestre O (2013) Clustering of maxima: spatial dependencies among heavy rainfall in France. J Clim 26(20):7929–7937. doi:10.1175/JCLI-D-12-00836.1
Bickel PJ, Lehmann EL (1975) Descriptive statistics for nonparametric models II. Location. Ann Stat 3(5):1045–1069. doi:10.1214/aos/1176343240
Bücher A, Kinsvater P, Kojadinovic I (2015) Detecting breaks in the dependence of multivariate extreme-value distributions. arXiv:1505.00954
Chavez-Demoulin V, Davison AC (2005) Generalized additive modelling of sample extremes. J R Stat Soc Ser C 54(1):207–222. doi:10.1111/j.1467-9876.2005.00479.x
Chebana F, Ouarda TB, Duong TC (2013) Testing for multivariate trends in hydrologic frequency analysis. J Hydrol 486:519–530. doi:10.1016/j.jhydrol.2013.01.007; http://www.sciencedirect.com/science/article/pii/S00221694130
Chernozhukov V, Fernández-Val I, Galichon A (2010) Quantile and probability curves without crossing. Econometrica 78(3):1093–1125. doi:10.3982/ECTA7880
Cunnane C (1973) A particular comparison of annual maxima and partial duration series methods of flood frequency prediction. J Hydrol 18(3):257–271. doi:10.1016/0022-1694(73)90051-6; http://www.sciencedirect.com/science/article/pii/002216947390
de Haan L, Tank A, Neves C (2015) On tail trend detection: modeling relative risk. Extremes 18(2):141–178. doi:10.1007/s10687-014-0207-8
de Haan L, Ferreira A (2006) Extreme value theory: an introduction. Springer, Zurich
Dierckx G (2011) Trends and change points in the tail behaviour of a heavy tailed distribution. In: Proceedings of 58th world statistical congress (ISI2011), Dublin. pp 290–299
Dierckx G, Teugels JL (2010) Change point analysis of extreme values. Environmetrics 21(7–8):661–686. doi:10.1002/env.1041
Dupuis DJ, Sun Y, Wang HJ (2015) Detecting change-points in extremes. Stat Interface 8(1):19–31. doi:10.4310/SII.2015.v8.n1.a3
Einmahl JHJ, de Haan L, Zhou C (2016) Statistics of heteroscedastic extremes. J R Stat Soc Ser B 78(1):31–51. doi:10.1111/rssb.12099
Gardes L, Girard S (2010) Conditional extremes from heavy-tailed distributions: an application to the estimation of extreme rainfall return levels. Extremes 13(2):177–204. doi:10.1007/s10687-010-0100-z
Gomes MI, Pestana D (2007) A sturdy reduced-bias extreme quantile (VaR) estimator. J Am Stat Assoc 102(477):280–292. doi:10.1198/016214506000000799
Hill BM (1975) A simple general approach to inference about the tail of a distribution. Ann Stat 3(5):1163–1174. doi:10.1214/aos/1176343247
Jarušková D, Rencová M (2008) Analysis of annual maximal and minimal temperatures for some European cities by change point methods. Environmetrics 19(3):221–233. doi:10.1002/env.865
Kendall MG (1948) Rank correlation methods. Charles Griffin, London
Kim M, Lee S (2009) Test for tail index change in stationary time series with pareto-type marginal distribution. Bernoulli 15(2):325–356. doi:10.3150/08-BEJ157
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Koenker R, Bassett JG (1978) Regression quantiles. Econometrica 46(1):33–50. http://www.jstor.org/stable/1913643
Kojadinovic I, Naveau P (2015) Nonparametric tests for change-point detection in the distribution of block maxima based on probability weighted moments. arXiv:1507.06121
Lekina A, Chebana F, Ouarda T (2014) Weighted estimate of extreme quantile: an application to the estimation of high flood return periods. Stoch Environ Res Risk Assess 28(2):147–165. doi:10.1007/s00477-013-0705-2
Madsen H, Rosbjerg D (1997) The partial duration series method in regional index-flood modeling. Water Resour Res 33(4):737–746. doi:10.1029/96WR03847
Mediero L, Santillán D, Garrote L, Granados A (2014) Detection and attribution of trends in magnitude, frequency and timing of floods in spain. J Hydrol 517:1072–1088. doi:10.1016/j.jhydrol.2014.06.040; http://www.sciencedirect.com/science/article/pii/S00221694140
Mu Y, He X (2007) Power transformation toward a linear regression quantile. J Am Stat Assoc 102(477):269–279. http://www.jstor.org/stable/27639838
Renard B, Lang M, Bois P (2006) Statistical analysis of extreme events in a non-stationary context via a bayesian framework: case study with peak-over-threshold data. Stoch Environ Res Risk Assess 21(2):97–112. doi:10.1007/s00477-006-0047-4
Resnick SI (2007) Heavy-tail phenomena: probabilistic and statistical modeling. Springer, New York
Ribatet M, Sauquet E, Grésillon JM, Ouarda TBMJ (2007) A regional Bayesian POT model for flood frequency analysis. Stoch Environ Res Risk Assess 21(4):327–339. doi:10.1007/s00477-006-0068-z
Roth M, Jongbloed G, Buishand T (2016) Threshold selection for regional peaks-over-threshold data. J Appl Stat 43(7):1291–1309. doi:10.1080/02664763.2015.1100589
Rulfová Z, Buishand A, Roth M, Kyselý J (2016) A two-component generalized extreme value distribution for precipitation frequency analysis. J Hydrol 534:659–668. doi:10.1016/j.jhydrol.2016.01.032; http://www.sciencedirect.com/science/article/pii/S0022169416000500
Schumann A (2005) Hochwasserstatistische bewertung des augusthochwassers 2002 im einzugsgebiet der mulde unter anwendung der saisonalen statistik. Hydrologie und Wasserbewirtschaftung 49(4):200–206
Silva AT, Naghettini M, Portela MM (2016) On some aspects of peaks-over-threshold modeling of floods under nonstationarity using climate covariates. Stoch Environ Res Risk Assess 30(1):207–224. doi:10.1007/s00477-015-1072-y
Strupczewski WG, Kochanek K, Bogdanowicz E, Markiewicz I (2012) On seasonal approach to flood frequency modelling. Part i: two-component distribution revisited. Hydrol Process 26(5):705–716. doi:10.1002/hyp.8179
Tabari H, Taye MT, Willems P (2015) Statistical assessment of precipitation trends in the upper Blue Nile river basin. Stoch Environ Res Risk Assess 29(7):1751–1761. doi:10.1007/s00477-015-1046-0
Teugels JL, Vanroelen G (2004) Box-cox transformations and heavy-tailed distributions. J Appl Probab 41:213–227. http://www.jstor.org/stable/3215978
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes—Springer series in statistics. Springer, New York
Wang HJ, Li D, He X (2012) Estimation of high conditional quantiles for heavy-tailed distributions. J Am Stat Assoc 107(500):1453–1464. doi:10.1080/01621459.2012.716382
Wang HJ, Li D (2013) Estimation of extreme conditional quantiles through power transformation. J Am Stat Assoc 108(503):1062–1074. doi:10.1080/01621459.2013.820134
Wang H, Tsai CL (2009) Tail index regression. J Am Stat Assoc 104(487):1233–1240. doi:10.1198/jasa.2009.tm08458
Weissman I (1978) Estimation of parameters and large quantiles based on the k largest observations. J Am Stat Assoc 73(364):812–815. doi:10.1080/01621459.1978.10480104
Wi S, Valdés JB, Steinschneider S, Kim TW (2016) Non-stationary frequency analysis of extreme precipitation in South Korea using peaks-over-threshold and annual maxima. Stoch Environ Res Risk Assess 30(2):583–606. doi:10.1007/s00477-015-1180-8
Yue S, Pilon P, Cavadias G (2002) Power of the Mann-Kendall and Spearman’s rho tests for detecting monotonic trends in hydrological series. J Hydrol 259(1–4):254–271. doi:10.1016/S0022-1694(01)00594-7, http://www.sciencedirect.com/science/article/pii/S00221694010
Acknowledgements
We would like to thank Professor Andreas Schumann from the Department of Civil Engineering, Ruhr-University Bochum, Germany, for providing us with hydrological data and for helpful discussions. We are also grateful to two anonymous referees and an Associate Editor for their constructive comments on an earlier version of our work. The financial support of the Deutsche Forschungsgemeinschaft (SFB 823, “Statistical modelling of nonlinear dynamic processes”) is gratefully acknowledged.
Appendices
Appendix: Quantile regression process
Let Y denote a random variable called response and \({\mathbf {X}}=(1,X_1,\ldots ,X_d)'\) a random vector called regressor with support covered by a compact set \({\mathscr {X}}\subset {\mathbb {R}}^{d+1}\). Throughout this section we suppose that the conditional distribution \(F(y|{\mathbf {x}})=P(Y\le y|\,{\mathbf {X}}={\mathbf {x}})\) of Y given \({\mathbf {X}}={\mathbf {x}}\) satisfies
$$F^{-1}(p|{\mathbf {x}})=\inf \left\{ y:\ F(y|{\mathbf {x}})\ge p\right\} ={\mathbf {x}}'{\varvec{\beta }}_p \tag{13}$$
for all \({\mathbf {x}}\in {\mathscr {X}}\), probabilities \(p\in I\subset [\varepsilon ,1-\varepsilon ]\) and an unknown vector-valued function \(p\mapsto {\varvec{\beta }}_p\), \(p\in I\), with \({\varvec{\beta }}_p\in {\mathbb {R}}^{d+1}\) called the p-th regression quantile (Koenker and Bassett 1978). The left-hand side of (13) is called the generalized inverse or quantile of \(F(\cdot |{\mathbf {x}})\) at \(p\in I\). It coincides with the usual inverse of a function, provided the inverse exists. Theoretical aspects and many applications of linear quantile regression are presented in Koenker (2005).
Let \((Y_i,{\mathbf {X}}_i)\), \(i=1,\ldots ,n\), denote independent copies of \((Y,{\mathbf {X}})\). The estimator
$$\hat{\varvec{\beta }}_p=\mathop {\mathrm {arg\,min}}_{{\varvec{\beta }}\in {\mathbb {R}}^{d+1}}\sum _{i=1}^n\rho _p\left( Y_i-{\mathbf {X}}_i'{\varvec{\beta }}\right)$$
with \(\rho _p(y)=y\cdot (p-{\mathbf {1}}_{\{y\le 0\}})\) is called the empirical regression quantile. The following result establishes asymptotic normality of \(\sqrt{n}\left(\hat{\varvec{\beta }}_p-{\varvec{\beta }}_p\right)\) uniformly in \(p\in I\), i.e., in the function space \(\left( \ell ^\infty (I)\right) ^{d+1}\) (van der Vaart and Wellner 1996).
Theorem 1
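Suppose that, uniformly in \({\mathbf {x}}\in {\mathscr {X}}\), the conditional density \(f(y|{\mathbf {x}})\) exists, is bounded and uniformly continuous in y. Suppose further that \({\mathbb {E}}\Vert {\mathbf {X}}\Vert ^{2+\delta }<\infty\) for some \(\delta >0\) and that
$$J={\mathbb {E}}\left[ {\mathbf {X}}{\mathbf {X}}'\right] \quad \text {and}\quad H_p={\mathbb {E}}\left[ f({\mathbf {X}}'{\varvec{\beta }}_p|{\mathbf {X}})\,{\mathbf {X}}{\mathbf {X}}'\right]$$
exist with \(H_p\) positive definite for all \(p\in I\). Then, for \(n\rightarrow \infty\), we have that
$$\sqrt{n}\left( \hat{\varvec{\beta }}_p-{\varvec{\beta }}_p\right) _{p\in I}\xrightarrow{D}\left( H_p^{-1}{\mathbb {Z}}(p)\right) _{p\in I}$$
in \(\left( \ell ^\infty (I)\right) ^{d+1}\), where \({\mathbb {Z}}\) is a centered Gaussian process with \({\mathbb {E}}[{\mathbb {Z}}(p){\mathbb {Z}}(q)']=(p\wedge q-p\cdot q)\cdot J\).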
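The previous result allows us to estimate the joint distribution of several empirical regression quantiles. Let \({\mathbf {p}}=\{p_1,\ldots ,p_\ell \}\subset I\) denote a set of probabilities. Then, for \(n\rightarrow \infty\), we immediately obtain that
$$\sqrt{n}\left( \hat{\varvec{\beta }}_{p_1}'-{\varvec{\beta }}_{p_1}',\ldots ,\hat{\varvec{\beta }}_{p_\ell }'-{\varvec{\beta }}_{p_\ell }'\right) '\xrightarrow{D}{\mathscr {N}}\left( 0,\Sigma _{{\mathbf {p}}}\right) ,$$
where \(\Sigma _{{\mathbf {p}}}\) is defined piecewise through its \((d+1)\times (d+1)\) blocks
$$\left( \Sigma _{{\mathbf {p}}}\right) _{i,j}=(p_i\wedge p_j-p_i\cdot p_j)\cdot H_{p_i}^{-1}JH_{p_j}^{-1},\quad 1\le i,j\le \ell .$$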
This result is used to prove Proposition 1.
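To make the empirical regression quantile defined earlier in this appendix more concrete, the following Python sketch minimizes the check-loss criterion \(\sum _i\rho _p(Y_i-{\mathbf {X}}_i'{\varvec{\beta }})\) for a single covariate, \({\mathbf {x}}=(1,x)'\). It uses the linear-programming fact that, when the x values are not all equal, some minimizer interpolates at least two observations with distinct x, so it simply enumerates all such pairs. The function names are hypothetical and the brute-force search is only meant for small samples; it is an illustration, not the algorithm used for the results in this article.

```python
from itertools import combinations


def check_loss(u, p):
    """Check function rho_p(u) = u * (p - 1{u <= 0})."""
    return u * (p - (1.0 if u <= 0 else 0.0))


def objective(beta, xs, ys, p):
    """Sum of check losses of the residuals y - (b0 + b1 * x)."""
    b0, b1 = beta
    return sum(check_loss(y - (b0 + b1 * x), p) for x, y in zip(xs, ys))


def regression_quantile(xs, ys, p):
    """Brute-force p-th empirical regression quantile for regressor (1, x)'.

    Tries every line through two observations with distinct x values and
    keeps the one with the smallest objective (O(n^3), small n only).
    """
    best, best_val = None, float("inf")
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # no unique line through these two points
        b1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
        b0 = ys[i] - b1 * xs[i]
        val = objective((b0, b1), xs, ys, p)
        if val < best_val:
            best, best_val = (b0, b1), val
    return best


# Toy data: four points on the line y = 1 + 2x plus one large outlier.
xs_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
ys_demo = [1.0, 3.0, 5.0, 7.0, 19.0]
beta_median = regression_quantile(xs_demo, ys_demo, 0.5)
```

For the toy data, the median fit (p = 0.5) follows the four collinear points and ignores the outlier, which is the robustness property that makes regression quantiles attractive as regressor-dependent thresholds.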
Conditional heavy-tail behavior: competing methods
1.1 Tail index regression (TIR) by Wang and Tsai (2009)
Wang and Tsai (2009) study model (3) with \(\alpha ({\mathbf {x}})=1/\gamma ({\mathbf {x}})=\exp ({\mathbf {x}}'\theta )\) for some unknown parameter vector \(\theta \in {\mathbb {R}}^{d+1}\). They propose the estimator
$$\hat{\theta }_{u_n}=\mathop {\mathrm {arg\,min}}_{\theta \in {\mathbb {R}}^{d+1}}\sum _{i=1}^n\left\{ \exp ({\mathbf {X}}_i'\theta )\log (Y_i/u_n)-{\mathbf {X}}_i'\theta \right\} {\mathbf {1}}(Y_i>u_n) \tag{16}$$
with regressor-independent threshold \(u_n\rightarrow \infty\) for \(n\rightarrow \infty\). Estimator (16) can be viewed as an approximate maximum likelihood approach based on the weak approximation of the distribution of \(\log (Y/u_n)\) given \({\mathbf {X}}={\mathbf {x}}\) and \(Y>u_n\) by an exponential distribution with mean \(1/\alpha ({\mathbf {x}})\). Let \(k=\sum _{i=1}^n{\mathbf {1}}(Y_i>u_n)\) be the effective sample size in (16) and \(\hat{\Sigma }_{u_n}=\frac{1}{k}\sum _{i=1}^n{\mathbf {X}}_i{\mathbf {X}}_i'{\mathbf {1}}(Y_i>u_n)\). Under certain technical assumptions, Wang and Tsai (2009) prove
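$$\sqrt{k}\,\hat{\Sigma }_{u_n}^{1/2}\left( \hat{\theta }_{u_n}-\theta \right) \xrightarrow{D}{\mathscr {N}}\left( {\mathbf {h}},I_{d+1}\right)$$
for some vector \({\mathbf {h}}\) and the \((d+1)\)-dimensional identity matrix \(I_{d+1}\). Estimating the bias \({\mathbf {h}}\) requires detailed information on the tail, which is hardly available in practice, so \({\mathbf {h}}\) is set to zero in applications.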
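The approximate likelihood idea behind tail index regression can be sketched in a few lines of Python. The sketch assumes that the criterion is the negative exponential log-likelihood \(\sum \{\exp ({\mathbf {x}}_i'\theta )\log (y_i/u)-{\mathbf {x}}_i'\theta \}\) taken over the exceedances \(y_i>u\), restricts attention to one covariate, \({\mathbf {x}}=(1,x)'\), and minimizes the convex criterion by a damped Newton iteration. The function names, the starting value and the stopping rule are illustrative choices, not part of the procedure of Wang and Tsai (2009).

```python
import math


def tir_objective(theta, data):
    """Negative approximate log-likelihood over the exceedances.

    data holds pairs (x_i, log(y_i / u)) for observations with y_i > u;
    the tail index is modelled as alpha(x) = exp(theta0 + theta1 * x).
    """
    t0, t1 = theta
    return sum(math.exp(t0 + t1 * x) * l - (t0 + t1 * x) for x, l in data)


def fit_tir(xs, ys, u, max_iter=100, tol=1e-10):
    """Damped Newton minimisation; needs >= 2 exceedances with distinct x."""
    data = [(x, math.log(y / u)) for x, y in zip(xs, ys) if y > u]
    k = len(data)  # effective sample size
    # Start at the intercept-only solution exp(theta0) = k / sum(log(y_i/u)).
    theta = (math.log(k / sum(l for _, l in data)), 0.0)
    for _ in range(max_iter):
        t0, t1 = theta
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, l in data:
            w = math.exp(t0 + t1 * x) * l
            g0 += w - 1.0          # gradient entries
            g1 += (w - 1.0) * x
            h00 += w               # Hessian entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton direction H^{-1} g
        d1 = (h00 * g1 - h01 * g0) / det
        if abs(d0) + abs(d1) < tol:
            break
        # Halve the step until the objective does not increase.
        step, f_old = 1.0, tir_objective(theta, data)
        while step > 1e-10 and tir_objective(
            (t0 - step * d0, t1 - step * d1), data
        ) > f_old:
            step /= 2.0
        theta = (t0 - step * d0, t1 - step * d1)
    return theta, k
```

In the intercept-only case the minimizer is available in closed form, \(\exp (\hat{\theta }_0)=k/\sum \log (y_i/u)\), which is the reciprocal of Hill's estimator computed from the exceedances; the sketch uses this value as its starting point.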
However, Wang and Tsai (2009) do not consider regressor-dependent thresholds as in Sect. 2.1, which are important in practice to account for regression effects in, e.g., the center of the distribution. To mitigate this problem, we suggest applying their estimation procedure to the sample \((Z_{k,j},{\mathbf {X}}_{k,j})\), \(j=1,\ldots ,k\), as given in Sect. 2.1. That is, replace \(\hat{\theta }_{u_n}\) by
and \(\hat{\Sigma }_{u_n}\) by \(\hat{\Sigma }_{k,n}=\frac{1}{k}\sum _{j=1}^k{\mathbf {X}}_{k,j}{\mathbf {X}}_{k,j}'\).
1.2 Three-stage procedure by Wang and Li (2013)
An alternative regression approach focusing on high conditional quantiles \(F^{-1}_Y(p|\ {\mathbf {x}})\), \(p\in [1-\varepsilon ,1)\), for some small number \(\varepsilon >0\) is proposed in Wang and Li (2013). Their method is based on the assumption that
$$g_\lambda \left( F^{-1}_Y(p|\ {\mathbf {x}})\right) ={\mathbf {x}}'\beta _p$$
holds for some \(\lambda \in {\mathbb {R}}\), Box-Cox transformation \(g_\lambda\), regression quantiles \(\beta _p\in {\mathbb {R}}^{d+1}\) and all \(p\in [1-\varepsilon ,1)\). They propose an estimator of \(\gamma ({\mathbf {x}})\) based on a three-stage procedure:
1. Set \(p=p_{k,n}=\frac{n-k}{n+1}\) and compute \(\hat{\lambda }\) as in Sect. 2.1.
2. Let \(p_{n-j,n}=\frac{j}{n+1}\) for \(j=1,\ldots ,m\) with \(m=n-\lfloor {n}^{\eta} \rfloor\) and \(\eta =0.1\). For \(j=1,\ldots ,m\), estimate \(F_Y^{-1}(p_{n-j,n}|\ {\mathbf {x}})\) by the right-hand side of (6) with \(g=g_{\hat{\lambda }}\) and \(p=p_{n-j,n}\). Denote these estimates by \(\hat{q}_j({\mathbf {x}})\), \(j=1,\ldots ,m\). If \(\hat{q}_j({\mathbf {x}})\) is not increasing in j, apply the rearrangement procedure of Chernozhukov et al. (2010).
3. For some integer \(k<m\), estimate \(\gamma ({\mathbf {x}})\) by
$$\hat{\gamma }_{k,n}({\mathbf {x}})=\frac{1}{k-\lfloor n^\eta \rfloor }\sum _{j=\lfloor n^\eta \rfloor }^k\left[ \log \hat{q}_{n-j}({\mathbf {x}})-\log \hat{q}_{n-k}({\mathbf {x}})\right] .$$
Thus \(\hat{\gamma }_{k,n}({\mathbf {x}})\) is Hill’s estimator (Hill 1975) applied to the sample of \(\hat{q}({\mathbf {x}})\) values, which can be seen as pseudo observations from \(F_Y(\ \cdot \ |\ {\mathbf {x}})\). Wang and Li (2013) also propose a test statistic
as a test for hypothesis \({\mathscr {H}}_{0,tail}\) in (4). If \({\mathscr {H}}_{0,tail}\), \(E({\mathbf {X}})=(1,0,\ldots ,0)'\in {\mathbb {R}}^{d+1}\) and either \(\gamma ^*({\mathbf {x}})=0\) or a certain homogeneity assumption are met, Wang and Li (2013) show under additional technical assumptions that \(kT_n\xrightarrow{D}\gamma ^2\chi _d^2\) holds. They also derive the limiting distribution under heterogeneity, which in practice involves the estimation of additional parameters. For more details we refer to Wang and Li (2013, Th. 3.3 and Cor. 3.1).
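The third stage of the procedure above is simple to compute once the estimated conditional quantiles are available. The following Python sketch implements the Hill-type average of log-ratios; it assumes that the first two stages have already produced an increasing, positive list of estimates \(\hat{q}_1({\mathbf {x}}),\ldots ,\hat{q}_m({\mathbf {x}})\) for one fixed \({\mathbf {x}}\). The function name and its argument layout are hypothetical.

```python
import math


def hill_from_quantiles(q_hat, n, k, eta=0.1):
    """Hill-type estimate of gamma(x) from estimated conditional quantiles.

    q_hat[j - 1] is the estimate of the conditional quantile at level
    j / (n + 1), j = 1, ..., m, with m = n - floor(n ** eta). The values
    must be positive and increasing (rearranged beforehand if necessary).
    """
    lo = math.floor(n ** eta)
    m = n - lo
    if len(q_hat) != m:
        raise ValueError("expected n - floor(n ** eta) quantile estimates")
    if not lo < k < m:
        raise ValueError("k must satisfy floor(n ** eta) < k < m")
    log_ref = math.log(q_hat[n - k - 1])  # log of q_hat_{n-k}
    total = sum(
        math.log(q_hat[n - j - 1]) - log_ref  # log q_hat_{n-j} - log q_hat_{n-k}
        for j in range(lo, k + 1)
    )
    return total / (k - lo)


# Toy input: exact quantiles of a standard Pareto law with gamma = 0.5.
n_demo, gamma_demo = 1000, 0.5
m_demo = n_demo - math.floor(n_demo ** 0.1)
q_demo = [(1.0 - j / (n_demo + 1)) ** (-gamma_demo) for j in range(1, m_demo + 1)]
gamma_hat = hill_from_quantiles(q_demo, n_demo, k=100)
```

Because only log-ratios enter, the estimate is invariant to rescaling of the quantile estimates. The choice of k remains the usual bias-variance trade-off of Hill-type estimators and is not addressed by this sketch.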
Cite this article
Kinsvater, P., Fried, R. Conditional heavy-tail behavior with applications to precipitation and river flow extremes. Stoch Environ Res Risk Assess 31, 1155–1169 (2017). https://doi.org/10.1007/s00477-016-1345-0
DOI: https://doi.org/10.1007/s00477-016-1345-0